In the domain of Computing with Words (CW), fuzzy linguistic approaches are known
to be relevant in many decision-making problems. Indeed, they allow us to model
human reasoning by replacing words, assessments, preferences, choices, wishes, etc. with ad hoc
variables, such as fuzzy sets or more sophisticated variables.

This paper focuses on a particular model: Herrera & Martinez' 2-tuple linguistic model
and their approach to deal with unbalanced linguistic term sets. It is interesting since
the computations are accomplished without loss of information while the results of the
decision-making processes always refer to the initial linguistic term set. They propose a
fuzzy partition which distributes data on the axis by using linguistic hierarchies to manage
the non-uniformity. However, the required input (especially the density around the terms)
taken by their fuzzy partition algorithm may be considered too demanding in a
real-world application, since density is not always easy to determine. Moreover, in some
limit cases (especially when two terms are semantically very close to each other), the
partition does not comply with the data themselves and is not close to reality. Therefore we
propose to modify the required input, in order to offer a simpler and more faithful partition.
We have added an extension to the package jFuzzyLogic and to the corresponding script
language FCL. This extension supports both 2-tuple models: Herrera & Martinez' and
ours. In addition to the partition algorithm, we present two aggregation algorithms: the
arithmetic mean and the addition. We also discuss these kinds of 2-tuple models.

1. INTRODUCTION 

Decision making is one of the most central human activities. The need to choose
between solutions in our complex world implies setting priorities on them according to
multiple criteria such as benefits, risk, feasibility, etc. The interest shown by scientists
in Multi Criteria Decision Making (MCDM) problems, as the survey of Bana e Costa
shows [5], has led to the development of many MCDM approaches such as Utility
Theory, Bayesian Theory, Outranking Methods and the Analytic Hierarchy Process
(AHP). But the main shortcoming of these approaches is that they represent the preferences
of the decision maker about a real-world problem in a crisp mathematical model.
As we are dealing with human reasoning and preference modeling, qualitative data
and linguistic variables may be more suitable to represent linguistic preferences and
their underlying aspects [5]. Martinez et al. have presented in [11] a wide list of
applications to show the usability and the advantages that linguistic information
(using various linguistic computational models) brings to decision making. The
preference extraction can be done thanks to elicitation strategies performed through
User Interfaces (UIs) [4] and Natural Language Processing (NLP) [3], in a
stimulus-response application for instance.

In the literature, many approaches make it possible to model linguistic preferences and
the interpretation made of them, such as the classical fuzzy approach from Zadeh [13].
Zadeh has introduced the notions of linguistic variable and granule [14] as basic
concepts that underlie human cognition. In [7], the authors review computing
with words in decision making and explain that a granule "which is the denotation
of a word (...) is viewed as a fuzzy constraint on a variable".

Among the existing models, there is one that makes it possible to deal with granularity and
with linguistic assessments in a fuzzy way with a simple and regular representation:
the fuzzy linguistic 2-tuples introduced by Herrera and Martinez [9]. Moreover,
this model enables the representation of unbalanced linguistic data (i.e. the fuzzy
sets representing the terms are not symmetrically and uniformly distributed on their
axis). However, in practice, the resulting fuzzy sets do not match human preferences
exactly. Now, we know how crucial the selection of the membership
functions is to determine the validity of a CW approach [11]. That is why an
intermediate representation model is needed when we are dealing with data that are
"very unbalanced" on the axis.

The aim of this paper is to introduce another kind of fuzzy partition for unbalanced
term sets, based on the fuzzy linguistic 2-tuple model. Using the levels of
linguistic hierarchies, a new algorithm is presented to improve the matching of the
fuzzy partitioning.

This paper is structured as follows. First, we briefly recall the fuzzy linguistic approach
and the 2-tuple fuzzy linguistic representation model by Herrera & Martinez.
In Section 3 we introduce a variant of fuzzy linguistic 2-tuples and the corresponding
partitioning algorithm before presenting aggregation operators (Section 4).
Then in Section 5 another extension of the model and a prospective application
of this new kind of 2-tuples are discussed. We finally conclude with some remarks.

2. THE 2-TUPLE FUZZY LINGUISTIC REPRESENTATION MODEL 

In this section we remind readers of the fuzzy linguistic approach, the 2-tuple fuzzy 
linguistic representation model and some related works. We also review some studies 
on the use of natural language processing in human computer interfaces. 

2.1. 2-tuples linguistic model and fuzzy partition 

Among the various fuzzy linguistic representation models, the approach that fits
our needs the most is the representation that has been introduced by Herrera and
Martinez in [9]. This model represents linguistic information by means of a pair
(s, α), where s is a label representing the linguistic term and α is the value of the
symbolic translation. The membership function of s is a triangular fuzzy set.
Let us note that in this paper we call a linguistic term a word (e.g. tall) and a
label a symbol on the axis (i.e. an s).

The computational model developed for this representation includes comparison,
negation and aggregation operators. By default, all triangular fuzzy sets are
uniformly distributed on the axis, but the targeted aspects are not usually uniform.
In such cases, the representation should be enhanced with tools such as unbalanced
linguistic term sets, which are not uniformly distributed on the axis [5]. To support
the non-uniformity of the terms (we recall that the term set is unbalanced), the
authors have chosen to change the scale granularity, instead of modifying the shape
of the fuzzy sets. The key element that manages multigranular linguistic information
is the level of a linguistic hierarchy, composed of an odd number of triangular
fuzzy sets of the same shape, equally distributed on the axis, as a fuzzy partition in
Ruspini's sense [12].

A linguistic hierarchy (LH) is composed of several label sets of different levels
(i.e., with different granularities). Each level of the hierarchy is denoted $l(t, n(t))$,
where $t$ is the level number and $n(t)$ the number of labels (see Figure 1). Thus, a
linguistic label set $S^{n(t)}$ belonging to a level $t$ of a linguistic hierarchy LH can be
denoted $S^{n(t)} = \{s_0^{n(t)}, \ldots, s_{n(t)-1}^{n(t)}\}$. In Figure 1 it should be noted that the
label drawn at the bottom in plain and dotted line is a bridge unbalanced label because it is
not symmetric. Actually each label has two sides: the upside (left side), denoted
$\overline{s_i}$, and the downside (right side), denoted $\underline{s_i}$. Between two levels there
are jumps, so we have to bridge the unbalanced term to obtain a fuzzy partition. The two
sides of a bridge unbalanced label belong to two different levels of the hierarchy.

Linguistic hierarchies are unions of levels and assume the following properties [10]:

• levels are ordered according to their granularity; 

• the linguistic label sets have an odd number n(t); 

• the membership functions of the labels are all triangular; 

• labels are uniformly and symmetrically distributed on [0, 1]; 

• the first level is l(1, 3), the second is l(2, 5), the third is l(3, 9), etc.
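The properties listed above can be illustrated with a minimal sketch. It assumes that each level is built from its predecessor by $n(t+1) = 2n(t) - 1$ and that every label is a triangle whose half-width equals the distance between two consecutive kernels; the function names `level_sizes` and `membership` are hypothetical and are not part of jFuzzyLogic.

```python
def level_sizes(depth):
    """Return [n(1), ..., n(depth)] with n(1) = 3 and n(t+1) = 2*n(t) - 1."""
    sizes = [3]
    while len(sizes) < depth:
        sizes.append(2 * sizes[-1] - 1)
    return sizes


def membership(i, n, x):
    """Triangular membership of label s_i in a level with n labels on [0, 1]."""
    step = 1.0 / (n - 1)                 # distance between two consecutive kernels
    return max(0.0, 1.0 - abs(x - i * step) / step)


sizes = level_sizes(4)                   # numbers of labels of levels 1 to 4
# Ruspini-style check: at a given point, the memberships of one level add up to 1.
total = sum(membership(i, 5, 0.3) for i in range(5))
```

At x = 0.3 in the second level only the two neighbouring labels are active, and their degrees add up to 1 up to floating-point rounding.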

Using the hierarchies, Herrera and Martinez have developed an algorithm that
partitions data in a convenient way.

This algorithm needs two inputs: the linguistic term set $\mathcal{S}$¹ (composed of the
medium term denoted $S_C$, the set of terms on its left denoted $S_L$ and the set of
terms on its right denoted $S_R$) and the density of term distribution on each side.
The density can be middle or extreme according to the user's choice. For example,
the description of $\mathcal{S} = \{A, B, C, D, E, F, G, H, I\}$ is {(2, extreme), 1, (6, extreme)}
with $S_L = \{A, B\}$, $S_C = \{C\}$ and $S_R = \{D, E, F, G, H, I\}$.



¹ When talking about linguistic terms, $\mathcal{S}$ (calligraphic font) is used; otherwise S (normal font)
is used.

Fig. 1. Unbalanced linguistic term sets: example of a 3-level partition



2.2. Drawbacks of the 2-tuple linguistic model fuzzy partition in our 
context 



First, the main problem of this algorithm is the density. Since the user is not
an expert, how could he manage to give the density? He would first have to
understand the notions of granularity and unbalanced scales.

Second, it is compulsory to have an odd number of terms (cf. $n(t)$) in order
to define a middle term (cf. $S_C$). But it may happen that this parity condition cannot
be fulfilled. For example, when talking about a GPS battery we can consider four
levels: full, medium, low and empty.

Last, the final result may be quite different from what was initially expected
because only a "small unbalance" is allowed. It means that even if the extreme
density is chosen, this does not guarantee that a very fine granularity is obtained. Only
two levels of density are allowed (middle or extreme), which can be a problem when
considering distances such as: arrived, very close, close, out of reach. "Out of
reach" needs a level of granularity quite different from the level for the terms "arrived",
"very close" and "close".

As the fuzzy partition obtained by this approach does not always fit
reality, we proposed in [1] a draft approach to overcome this problem. This is
further described in [2], where we mainly focus on the industrial context (geolocation)
and the underlying problems addressed by our specific constraints.

The implementations and tests made for this work are based on the jFuzzyLogic
library. It is the fuzzy logic package most widely used by Java developers. It implements
the Fuzzy Control Language (FCL) specification (IEC 61131-7) and is available under
the GNU Lesser General Public License (LGPL).

Even if it is not the main point of this paper, one part of our work is to provide an
interactive tool in the form of a natural language dialogue interface. This dialogue,
through an elicitation strategy, helps to extract the human preferences. We use NLP
techniques to represent the grammatical, syntactical and semantic relations between
the words used during the interaction part. Moreover, to be able to interpret these
words, the NLP is associated with fuzzy linguistic techniques. Thus, fuzzy semantics
are associated with each word supported by the interactive tool (especially
adjectives such as "long", "short", "low", "high", etc.) and can be used at
interpretation time. This NLP-fuzzy linguistic association also makes it possible to assign
different semantics to the same word depending on the user's criteria (business domain,
context, etc.). It then makes it possible to unify the words used in the dialogue interface
for different use cases by only switching between their different semantics.

Another interesting aspect of this NLP-fuzzy linguistic association lies in the 
possibility of an automatic semantic generation in a sort of autocompletion mode. 

For example, in a geolocation application, if the question is "When do you want
to be notified?", a user's answer can be "I want to be notified when the GPS battery
level is low". Here the user says low, so we propose a semantic distribution of the
labels of the term set according to the number of synonyms of this term. Indeed,
the semantic relations between words introduced by NLP (synonyms, homonyms,
opposites, etc.) can be used to highlight words semantically associated with the term low
and then to construct a linguistic label set around it. The more relevant
words found for a term, the higher the density of labels around it. In comparison
with the 2-tuple fuzzy linguistic model introduced by Herrera et al., this amounts
to deducing the density (in Herrera & Martinez' sense) from the number of
synonyms of a term. In practice, thanks to a synonym dictionary it is possible to
compute a semantic distance between the terms given by the geolocation expert. If
two terms are considered synonymous, they will share the same LH. Moreover,
a word with few (or no) synonyms will be represented in a coarse-grained hierarchy
while a word with many synonyms will be represented in a fine-grained hierarchy.

We can see here how relevant unbalanced linguistic label sets can be in
many situations. Coupling NLP techniques and fuzzy linguistic models seems very
promising.

[Figure: triangular fuzzy sets NoAlcohol, YoungLegalLimit, Intermediate, LegalLimit and RiskOfDeath on the BAC axis, from 0.000 to 0.300]

Fig. 2. The ideal fuzzy partition for the BAC example.



3. TOWARDS ANOTHER KIND OF 2-TUPLES LINGUISTIC MODEL 

Starting from a running example, we now present our proposal that aims at avoiding 
the drawbacks mentioned above. 



3.1. Running example 

Herrera & Martinez' methodology needs a term set $\mathcal{S}$ and an associated description
with two densities. For instance, when considering the blood alcohol concentration
(BAC, in percentage) in the USA, we can focus on five main values: 0% means
no alcohol, .05% is the legal limit for drivers under 21, .065% is an intermediate
value (illegal for young drivers but legal for the others), .08% is the legal limit for
drivers older than 21 and .3% is considered as the BAC level where there is a risk of
death. In particular, the ideal partition should comply with the data and with the
gaps between values (see Figure 2, which simply proposes triangular fuzzy sets without
any real semantics, obtained directly from the input values). But this prevents us
from using the advantages of Herrera & Martinez' method, which are mainly to keep
the original semantics of the terms, i.e. to keep the same terms from the original
linguistic term set. The question is how to express the results of the computations
linguistically if the partition does not fulfill "good" properties such as those of the
2-tuple linguistic model.
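As an illustration of what "triangular fuzzy sets obtained directly from the input values" can mean, here is a minimal sketch. It assumes that the triangle of each term peaks at its own value and reaches zero at the values of its two neighbours; this shape is our assumption, and `ideal_membership` is a hypothetical helper, not part of jFuzzyLogic.

```python
BAC = [("NoAlcohol", 0.0), ("YoungLegalLimit", 0.05), ("Intermediate", 0.065),
       ("LegalLimit", 0.08), ("RiskOfDeath", 0.3)]


def ideal_membership(pairs, k, x):
    """Membership of x in the triangle peaking at v_k, with feet at v_(k-1) and v_(k+1)."""
    v = [value for _, value in pairs]
    if x == v[k]:
        return 1.0
    if k > 0 and v[k - 1] < x < v[k]:               # rising (left) side
        return (x - v[k - 1]) / (v[k] - v[k - 1])
    if k < len(v) - 1 and v[k] < x < v[k + 1]:      # falling (right) side
        return (v[k + 1] - x) / (v[k + 1] - v[k])
    return 0.0


# Degrees of a BAC of 0.19, halfway between LegalLimit and RiskOfDeath.
degrees = {name: ideal_membership(BAC, k, 0.19) for k, (name, _) in enumerate(BAC)}
```

Such a partition follows the data exactly, but its triangles no longer belong to any level of a linguistic hierarchy, which is the difficulty raised above.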



3.2. Extension of jFuzzyLogic and preliminary definitions 

With Herrera & Martinez' method, we have
$\mathcal{S}$ = {NoAlcohol, YoungLegalLimit, Intermediate, LegalLimit, RiskOfDeath} and its
description is {(3, extreme), 1, (1, extreme)} with $S_L$ = {NoAlcohol, YoungLegalLimit,
Intermediate}, $S_C$ = {LegalLimit} and $S_R$ = {RiskOfDeath}.
Our jFuzzyLogic extension (to which we have added the management of Herrera & Martinez'
2-tuple linguistic model) helps to model this information, and we obtain the following
FCL script:

    VAR_INPUT
        BloodAlcoholConcentration : LING;
    END_VAR

    FUZZIFY BloodAlcoholConcentration
        // left terms | central term | right terms, then the two densities
        TERM S := ling NoAlcohol YoungLegalLimit Intermediate | LegalLimit | RiskOfDeath,
                  extreme extreme;
    END_FUZZIFY

The resulting fuzzy partition is quite different from what was initially expected
(see Figure 3 compared to Figure 2, where we notice that the label unbalance is
not really respected). We recall that each label $s_i$ has two sides. For instance, the
label $s_0$ associated to NoAlcohol has a downside and no upside, while the label $s_4$
associated to RiskOfDeath has an upside and no downside.



[Figure: fuzzy sets NoAlcohol, YoungLegalLimit, Intermediate, LegalLimit and RiskOfDeath on the BAC axis, from 0.000 to 0.300]

Fig. 3. Fuzzy partition generated by Herrera & Martinez' approach.



Two problems appear: the use of densities is not always obvious for final users,
and the gaps between values (especially between LegalLimit and RiskOfDeath) are
not respected.

To avoid the use of the densities, which can be hard to obtain from the user (e.g.,
see the specific geolocation industrial context explained in [2]), we have outlined in [1]
a tentative approach which offers a simpler way to retrieve unbalanced linguistic
terms. The aim was to accept any kind of description of the terms coming from
the user. That is why we propose an extension of jFuzzyLogic to handle linguistic
2-tuples, in addition to an enrichment of the FCL language specification. Consequently,
we suggest another way to define a TERM with a new type of variable called
LING (see the example below).

    VAR_INPUT
        BloodAlcoholConcentration : LING;
    END_VAR

    FUZZIFY BloodAlcoholConcentration
        // each term is given with its position on the axis
        TERM S := ling (NoAlcohol, 0.0) (YoungLegalLimit, .05)
                  (Intermediate, 0.065) (LegalLimit, 0.08) (RiskOfDeath, .3);
    END_FUZZIFY

It should be noted that the linguistic values are composed of a pair (s, v) where
s is a linguistic term (e.g., LegalLimit) and v is a number giving the position of s on
the axis (e.g., 0.08). Thus several definitions can now be given.

Definition 3.1. Let $\mathcal{S}$ be an unbalanced ordered linguistic term set and $U$ be the
numerical universe where the terms are projected. Each linguistic value is defined by
a unique pair $(s, v) \in \mathcal{S} \times U$. The numerical distance between $s_i$ and $s_{i+1}$ is denoted
by $d_i$, with $d_i = v_{i+1} - v_i$.

Definition 3.2. Let $S = \{s_0, \ldots, s_p\}$ be an unbalanced linguistic label set and
$(s_i, \alpha)$ be a linguistic 2-tuple. To support the unbalance, $S$ is extended to several
balanced linguistic label sets, each one denoted $S^{n(t)} = \{s_0^{n(t)}, \ldots, s_{n(t)-1}^{n(t)}\}$ (obtained
from the algorithm of [10]) defined in the level $t$ of a linguistic hierarchy LH with
$n(t)$ labels. There is a unique way to go from $\mathcal{S}$ (Definition 3.1) to $S$, according to
Algorithm 1.

Definition 3.3. Let $l(t, n(t))$ be a level from a linguistic hierarchy. The grain $g$ of
$l(t, n(t))$ is defined as the distance between two consecutive 2-tuples $(s_i^{n(t)}, \alpha)$.

Proposition 3.4. The grain $g$ of a level $l(t, n(t))$ is obtained as: $g_{l(t,n(t))} = 1/(n(t) - 1)$.

Proof. $g$ is defined as the distance between $(s_i^{n(t)}, \alpha)$ and $(s_{i+1}^{n(t)}, \alpha)$, i.e., between
two kernels of the associated triangular fuzzy sets because $\alpha$ equals 0. Since the
hierarchy is normalized on [0, 1], this distance is easy to compute using the $\Delta^{-1}$ operator
from [10], where $\Delta^{-1}(s_i^{n(t)}, \alpha) = \frac{i}{n(t)-1} + \alpha = \frac{i}{n(t)-1}$. As a result,
$g_{l(t,n(t))} = \frac{i+1}{n(t)-1} - \frac{i}{n(t)-1} = 1/(n(t) - 1)$. □

For instance, the grain of the second level is $g_{l(2,5)} = .25$.

Proposition 3.5. The grain $g$ of a level $l(t-1, n(t-1))$ is twice the grain of the
level $l(t, n(t))$: $g_{l(t-1,n(t-1))} = 2\, g_{l(t,n(t))}$.

Proof. This comes from the following property of the linguistic hierarchies. Let
$l(t, n(t))$ be a level. Its successor is defined as $l(t+1, 2n(t) - 1)$ (see [8]). Hence
$n(t+1) - 1 = 2(n(t) - 1)$ and, by Proposition 3.4, the grain is halved from one level to
the next. □
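Propositions 3.4 and 3.5 can be checked numerically with a short sketch. The helper `grains` is hypothetical; its `v_max` parameter, an assumption added here, rescales the hierarchy from [0, 1] to [0, v_max].

```python
def grains(depth, v_max=1.0):
    """Map each level t in 1..depth to (n(t), grain), for a hierarchy scaled on [0, v_max]."""
    table = {}
    n = 3                                  # the first level is l(1, 3)
    for t in range(1, depth + 1):
        table[t] = (n, v_max / (n - 1))    # Proposition 3.4: grain = 1 / (n(t) - 1)
        n = 2 * n - 1                      # successor level: l(t + 1, 2 n(t) - 1)
    return table


table = grains(5)
# Proposition 3.5: the grain of level t - 1 is twice the grain of level t.
halving = all(table[t - 1][1] == 2 * table[t][1] for t in range(2, 6))
```

On [0, 1] the grains are exact powers of 1/2, so the equality test holds without any tolerance.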



Towards an extension of the 2-tuple model 



Algorithm 1 Partitioning algorithm

Require: $\langle (s_0, v_0), \ldots, (s_{p-1}, v_{p-1}) \rangle$ are $p$ pairs of $\mathcal{S} \times U$;
    $t, t_0, \ldots, t_{p-1}$ are levels of hierarchies
scale the linguistic hierarchies on $[0, v_{max}]$, with $v_{max}$ the maximum $v$ value
precompute $\eta$ levels and their grain $g$ ($\eta > 6$)
for $k = 0$ to $p - 2$ do
    $d_k \leftarrow v_{k+1} - v_k$
    for $t = \eta$ to 1 do
        if $g_{l(t,n(t))} \le d_k$ then
            $t_k \leftarrow t$
        end if
    end for
    $tmp = v_{max}$
    for $i = 0$ to $n(t_k) - 1$ do
        if $tmp > |\Delta^{-1}(s_i^{n(t_k)}, 0) - v_k|$ then
            $tmp = |\Delta^{-1}(s_i^{n(t_k)}, 0) - v_k|$
            $j \leftarrow i$
        end if
    end for
    $\underline{s_k}^{n(t_k)} \leftarrow s_j^{n(t_k)}$; $\overline{s_{k+1}}^{n(t_k)} \leftarrow s_{j+1}^{n(t_k)}$
    depending on the level, $\alpha_k = v_k - \Delta^{-1}(s_j^{n(t_k)}, 0)$ or
        $\alpha_{k+1} = v_{k+1} - \Delta^{-1}(s_{j+1}^{n(t_k)}, 0)$
end for
return the set $\{(\underline{s_0}^{n(t_0)}, \alpha_0), (\overline{s_1}^{n(t_0)}, \alpha_1), (\underline{s_1}^{n(t_1)}, \alpha_1), \ldots,
    (\underline{s_{p-2}}^{n(t_{p-2})}, \alpha_{p-2}), (\overline{s_{p-1}}^{n(t_{p-2})}, \alpha_{p-1})\}$
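The following is a minimal sketch of one possible reading of Algorithm 1, not the jFuzzyLogic implementation. It assumes that the level kept for a gap is the coarsest one whose scaled grain does not exceed $d_k$, that each α is the signed offset between v and the kernel of the chosen label, and that the finest level is used when no level qualifies. The name `partition` and the tuple layout are hypothetical.

```python
BAC = [("NoAlcohol", 0.0), ("YoungLegalLimit", 0.05), ("Intermediate", 0.065),
       ("LegalLimit", 0.08), ("RiskOfDeath", 0.3)]


def partition(pairs, eta=7):
    """Return one downside ("R") and one upside ("L") 2-tuple per gap between terms.

    pairs: list of (term, v) sorted by increasing v.
    Each result is (term, side, level t, n(t), label index, alpha), alpha not normalized.
    """
    v_max = max(v for _, v in pairs)
    n = {1: 3}
    for t in range(2, eta + 1):
        n[t] = 2 * n[t - 1] - 1
    grain = {t: v_max / (n[t] - 1) for t in n}     # hierarchies scaled on [0, v_max]

    result = []
    for k in range(len(pairs) - 1):
        (term_k, v_k), (term_next, v_next) = pairs[k], pairs[k + 1]
        d_k = v_next - v_k
        t_k = eta                                  # assumption: finest level if none qualifies
        for t in range(eta, 0, -1):                # ends on the coarsest level with grain <= d_k
            if grain[t] <= d_k:
                t_k = t
        g = grain[t_k]
        # label whose kernel is nearest to v_k; the last label is excluded so that j + 1 exists
        j = min(range(n[t_k] - 1), key=lambda i: abs(i * g - v_k))
        result.append((term_k, "R", t_k, n[t_k], j, v_k - j * g))
        result.append((term_next, "L", t_k, n[t_k], j + 1, v_next - (j + 1) * g))
    return result


two_tuples = partition(BAC)
```

With this reading, both 0.015 gaps of the BAC example fall in l(5, 33), so the sketch should not be expected to reproduce every row of Table 1; it only shows how the level, the label index and α are derived from the pairs (s, v).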



3.3. A new partitioning 

The aim of the partitioning is to assign a label $s_i^{n(t)}$ (indeed one or two) to each
term $s_k$. The selection of $s_i^{n(t)}$ depends on both the distance $d_k$ and the numerical
value $v_k$. We look for the nearest level (they are all known in advance, see
Table 1 in [5]), i.e., for the level whose grain is closest to $d_k$. Then the right
$s_i^{n(t_k)}$ is chosen to match $v_k$ with the best accuracy: $i$ has to minimize the quantity
$|\Delta^{-1}(s_i^{n(t_k)}, 0) - v_k|$.

By default, the linguistic hierarchies are distributed on [0, 1], so a scaling is needed
in order that they match the universe $U$.

The detail of these different steps is given in Algorithm 1. We notice that there
is no condition on the parity of the number of terms. Besides, the function returns
a set of bridge unbalanced linguistic 2-tuples with a level of granularity that may
not be the same for the upside as for the downside.

Herrera & Martinez' partitioning does not follow the user's wishes exactly because
it transforms them into a model with many properties, such as the Ruspini
conditions [12]. As for us, we try to match the wishes as closely as possible by adding lateral
translations $\alpha$ to the labels $s_i^{n(t)}$. As a result, the previous properties may no longer
be fulfilled. For instance, what we obtain is not a fuzzy partition.
But we accept doing without these conditions, since the goal is to totally cover the
universe. This is guaranteed by the minimal covering property.

Proposition 3.6. The 2-tuples $(s_i^{n(t)}, \alpha)$ (from several levels $l(t, n(t))$) obtained
from our partitioning algorithm are triangular fuzzy sets that cover the entire
universe $U$.

Actually, the distance between any pair $((s_k^{n(t_k)}, \alpha_k), (s_{k+1}^{n(t_k)}, \alpha_{k+1}))$ is always strictly
less than twice the grain of the corresponding level.

Proof. By definition and construction, $d_k$ is used to choose the convenient level
$t$ for this pair. We recall that when $t$ decreases, $g_{l(t,n(t))}$ increases. As a result, we
have:

$g_{l(t,n(t))} \le d_k < g_{l(t-1,n(t-1))}$    (1)

After having applied the steps of the assignation process we obtain two linguistic
2-tuples $(\underline{s_k}^{n(t)}, \alpha_k)$ and $(\overline{s_{k+1}}^{n(t)}, \alpha_{k+1})$ representing the downside and upside of labels
$s_k$ and $s_{k+1}$ respectively.

Thanks to the symbolic translations $\alpha$, the distance between the kernels of these
two 2-tuples is $d_k$. Then, according to Proposition 3.5 and to Equation (1) we conclude
that:

$d_k < 2\, g_{l(t,n(t))}$    (2)

which means that, for each value in $U$, this fuzzy partition has a minimum membership
value $\varepsilon$ strictly greater than 0.

Considering $\mu_{s_i^{n(t)}}$ the membership function associated with a label $s_i^{n(t)}$, this
property is denoted:

$\forall u \in U,\ \mu_{s_0^{n(t_0)}}(u) \vee \cdots \vee \mu_{s_i^{n(t_i)}}(u) \vee \cdots \vee \mu_{s_{p-1}^{n(t_{p-1})}}(u) \ge \varepsilon > 0$    (3)

□
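The bound used in this proof can be made concrete with a small sketch. It assumes that the downside of term k and the upside of term k+1 are half-triangles whose width equals the grain g of the chosen level, with kernels $d_k$ apart; under that assumption the two sides meet halfway at membership $1 - d_k/(2g)$, which is positive exactly when $d_k < 2g$. The function name is hypothetical.

```python
def crossing_membership(d_k, g):
    """Lowest membership reached between two kernels d_k apart, for half-triangles of width g."""
    # downside of term k:    1 - (x - v_k) / g
    # upside of term k + 1:  1 - (v_k + d_k - x) / g
    # both sides are equal at x = v_k + d_k / 2
    return max(0.0, 1.0 - d_k / (2.0 * g))


# Two gaps of the BAC example with the scaled grains of l(3, 9) and l(1, 3).
eps_first = crossing_membership(0.05, 0.3 / 8)
eps_last = crossing_membership(0.22, 0.3 / 2)
```

When $d_k$ reaches $2g$ the value drops to 0, which is the limit case excluded by Equation (2).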

To illustrate this work, we take the running example concerning the BAC. The set
of pairs (s, v) is the following: {(NoAlcohol, .0), (YoungLegalLimit, .05), (Intermediate, .065),
(LegalLimit, .08), (RiskOfDeath, .3)}.

It should be noted that our algorithm requires adding another level of hierarchy:
l(0, 2).

We denote by L and R the upside and downside of labels respectively. Table 1
shows the results, with α values not normalized. To normalize them, it is easy to
see that they have to be multiplied by 1/.3 because $v_{max} = .3$.

See Figure 4 for a graphical representation of the fuzzy partition.

linguistic term        level       2-tuple
NoAlcohol_R            l(3, 9)     ($s_0^{9}$, 0)
YoungLegalLimit_L      l(3, 9)     ($s_1^{9}$, .0125)
YoungLegalLimit_R      l(5, 33)    ($s_5^{33}$, .003)
Intermediate_L         l(5, 33)    (…)
Intermediate_R         l(4, 17)    (…, 0)
LegalLimit_L           l(4, 17)    ($s_4^{17}$, .005)
LegalLimit_R           l(1, 3)     ($s_1^{3}$, -.07)
RiskOfDeath_R          l(1, 3)     ($s_2^{3}$, 0)

Table 1. The 2-tuple set for the BAC example.



[Figure: half-triangles NoAlcohol_R, YoungLegalLimit_L, YoungLegalLimit_R, Intermediate_L, Intermediate_R, LegalLimit_L, LegalLimit_R and RiskOfDeath on the BAC axis, from 0.000 to 0.300]

Fig. 4. Fuzzy partition generated by our algorithm for the BAC example.



4. AGGREGATION WITH OUR 2-TUPLES 

4.1. Arithmetic mean 

As our representation model is based on the 2-tuple fuzzy linguistic one, we can
use the aggregation operators (weighted average, arithmetic mean, etc.) of the
unbalanced linguistic computational model introduced in [8]. The functions $\Delta$, $\Delta^{-1}$,
$\mathcal{LH}$ and $\mathcal{LH}^{-1}$ used in our aggregation are derived from the same functions in Herrera
& Martinez' computational model.

In the aggregation process, linguistic terms $(s_k, v_k)$ belonging to a linguistic term
set $\mathcal{S}$ have to be dealt with. After the assignation process, these terms are associated
to one or two 2-tuples $(s_i^{n(t)}, \alpha_i)$ (remember the upside and downside of a label) of
a level from a linguistic hierarchy LH. We recall two definitions taken from [5].

Definition 4.1. $\mathcal{LH}^{-1}$ is the transformation function that associates with each
linguistic 2-tuple expressed in LH its respective unbalanced linguistic 2-tuple.

Definition 4.2. Let $S = \{s_0, \ldots, s_g\}$ be a linguistic label set and $\beta \in [0, g]$ a value
supporting the result of a symbolic aggregation operation. Then the linguistic
2-tuple that expresses the equivalent information to $\beta$ is obtained with the function
$\Delta : [0, g] \to S \times [-.5, .5)$, such that

$\Delta(\beta) = (s_i, \alpha)$, with $i = \mathrm{round}(\beta)$ and $\alpha = \beta - i$, $\alpha \in [-.5, .5)$,

where $s_i$ has the closest index label to $\beta$ and $\alpha$ is the value of the symbolic translation.
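Definition 4.2 can be sketched in a few lines, working on the index scale [0, g]. The names `delta` and `delta_inv` are hypothetical. Rounding half up is used on purpose, since Python's built-in `round` rounds halves to the nearest even integer and could yield α = .5, outside [-.5, .5).

```python
import math


def delta(beta, g):
    """Delta: map beta in [0, g] to (i, alpha), with s_i the closest label and alpha in [-.5, .5)."""
    if not 0 <= beta <= g:
        raise ValueError("beta must lie in [0, g]")
    i = math.floor(beta + 0.5)        # round half up
    return i, beta - i


def delta_inv(i, alpha):
    """Inverse mapping: a 2-tuple (s_i, alpha) back to its numerical value."""
    return i + alpha


index, alpha = delta(3.25, 4)         # closest label is s_3, shifted by .25
back = delta_inv(index, alpha)        # recovers 3.25
```

The pair (index, alpha) carries exactly the same information as beta, which is why computations on 2-tuples are said to be carried out without loss of information.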

Thus the aggregation process (arithmetic mean) can be summarized by the three
following steps:

1. Apply the aggregation operator to the v values of the linguistic terms. Let
$\beta$ be the result of this aggregation.

2. Use the $\Delta$ function to obtain the 2-tuple $(s_q^{n(t)}, \alpha_q)$ of LH corresponding to $\beta$.

3. In order to express the resulting 2-tuple in the initial linguistic term set $\mathcal{S}$, we
use the $\mathcal{LH}^{-1}$ function as defined in [8] to obtain the linguistic pair $(s_j, v_j)$.



To illustrate the aggregation process, we suppose that we want to aggregate two
terms (two pairs (s, v)) of our running example concerning the BAC:
(YoungLegalLimit, .05) and (LegalLimit, .08). In this example we use the arithmetic mean as
aggregation operator.

Using our representation algorithm, the term (YoungLegalLimit, .05) is associated
to $(s_1^{9}, .0125)$ and $(s_5^{33}, .003)$, and (LegalLimit, .08) is associated to $(s_4^{17}, .005)$ and
$(s_1^{3}, -.07)$ (see Table 1). First, we apply the arithmetic mean to the v values of the two terms. As
these values are on an absolute scale, the computations are simplified. The result of the
aggregation is $\beta = .065$.

The second step is to represent the linguistic information of aggregation by a
linguistic label expressed in LH. For the representation we choose, among the levels
associated with the two labels, the one with the finest grain. In our example it is l(5, 33) (fifth level
of LH with n(t) = 33). Then we apply the $\Delta$ function on $\beta$ to obtain the result:
$\Delta(.065) = (s_7^{33}, -.001)$.

Finally, in order to express the above result in the initial linguistic term set $\mathcal{S}$,
we apply the $\mathcal{LH}^{-1}$ function. It associates to a linguistic 2-tuple in LH its corresponding
linguistic term in $\mathcal{S}$. Thus, we obtain the final result $\mathcal{LH}^{-1}((s_7^{33}, -.001))$ =
(YoungLegalLimit, .005).
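The first two steps of this example can be retraced with a short sketch. It assumes that Δ is applied on the hierarchy scaled to [0, .3], so that α stays on the absolute, non-normalized scale; the last step is left out because $\mathcal{LH}^{-1}$ is defined in [8] and not here. The helper `delta_scaled` is hypothetical.

```python
def delta_scaled(beta, n, v_max):
    """Closest label index and non-normalized alpha for beta, in a level of n labels on [0, v_max]."""
    g = v_max / (n - 1)                           # scaled grain of the level
    i = min(n - 1, max(0, int(beta / g + 0.5)))   # nearest kernel, kept inside the level
    return i, beta - i * g


values = [0.05, 0.08]                  # YoungLegalLimit and LegalLimit
beta = sum(values) / len(values)       # step 1: arithmetic mean of the v values
index, alpha = delta_scaled(beta, 33, 0.3)        # step 2: express beta in l(5, 33)
```

The kernel of the seventh label of l(5, 33) lies at 7 × .3/32 = .065625, hence an offset of about -.0006, which is reported as -.001 in the text.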

Given that countries have different rules concerning the BAC for drivers, the ag- 
gregation of such linguistic information can be relevant to calculate an average value 
of allowed and prohibited blood alcohol concentration levels for a set of countries 
(Europe, Africa, etc.). 






4.2. Addition 

As we are using an absolute scale on the axis for our linguistic terms, the approach
for other operators is the same as the one described above for the arithmetic mean
aggregation. We first apply the operator to the v values of the linguistic terms and
then we use the $\Delta$ and the $\mathcal{LH}^{-1}$ functions successively to express the result in the
original term set.

If we consider for instance that, this time, we need to add the two following terms:
(YoungLegalLimit, .05) and (LegalLimit, .08), we write (YoungLegalLimit, .05) ⊕
(LegalLimit, .08) and proceed as follows:

• We add the two v values .05 and .08 to obtain $\beta = .13$.

• We then apply the $\Delta$ function to express $\beta$ in LH: $\Delta(.13) = (s_{14}^{33}, -.001)$.

• Finally, we apply the $\mathcal{LH}^{-1}$ function to obtain the result expressed in the
initial linguistic term set $\mathcal{S}$: $\mathcal{LH}^{-1}((s_{14}^{33}, -.001))$ = (LegalLimit, .05).

This ⊕ addition looks like a fuzzy addition operator (see e.g. [9]) used as a basis
for many aggregation processes (combining experts' preferences, etc.). Actually, the ⊕
operator can be seen as an extension (in the sense of Zadeh's extension principle) of
the addition for our 2-tuples.

The same approach can be applied to other operators. It will be further explored
in our future work.

5. DISCUSSIONS 

5.1. Towards a fully linguistic model 

When dealing with linguistic tools, the aim is to spare the user from supplying precise
numbers, since he is not always able to give them. Thus, in the pair (s, v) that
describes the data, it may happen that the user does not know the exact position
v.

For instance, considering five grades (A, B, C, D, E), the user knows that (i) D
and E are fail grades, (ii) A is the best one, (iii) B is not far away, (iv) C is in the middle.
If we replace v by a linguistic term, that is a stretch factor, the five pairs in the
previous example could be: (A, VeryStuck); (B, Far); (C, Stuck); (D, ModeratelyStuck);
(E, N/A) (see Figure 5). (A, VeryStuck) means that A is very stuck to its next label.
(E, N/A) means that E is the last label (the v value is not applicable).

This improvement makes it possible to ask the user for:

• either the pairs (s, v), with v a linguistic term (stretch factor);

• or only the labels s, placed on a visual scale (i.e., the stretch factors
are automatically computed to obtain the pairs (s, v));

• or the pairs (s, v), with v a numerical value, as proposed above.

[Figure: labels A, B, C, D, E placed on an axis according to the pairs (A, VeryStuck); (B, Far); (C, Stuck); (D, ModeratelyStuck); (E, N/A)]

Fig. 5. Example of the use of a stretch factor



It should be noted that the first case guarantees fully linguistic pairs
(s, v). It should also be noted that our stretch factor looks like Herrera & Martinez'
densities, but in our case, it makes it possible to construct a more accurate representation of
the terms.



5.2. Towards a simplification of binary trees 

The linguistic 2-tuple model that uses the pair $(s_i^{n(t)}, \alpha)$ and its corresponding level
of linguistic hierarchy can be seen as another way to express the various nodes of
a tree. There is a parallel to draw between the node depth and the level of the
linguistic hierarchy. Indeed, let us consider a binary tree, to simplify. The root node
belongs to the first level, that is l(1, 3) according to [10]. Then its children belong to
the second one (l(2, 5)), knowing that the next level is obtained from its predecessor:
$l(t+1, 2n(t) - 1)$. And so on, for each node, until there is no node left. In the simple
case of a binary tree (i.e., a node has two children or no child), it is easy to give the
position (the 2-tuple $(s_i^{n(t)}, \alpha)$) of each node: this position is unique, and the left child
is on the left of its parent in the next level (resp. right for the right child).

The algorithm that simplifies a binary tree into a linguistic 2-tuple set is
now given (see Algorithm 2). If we consider the graphical example of Figure 6, the
linguistic 2-tuple set we obtain is the following (ordered by level):
{(s\ , 0), (si , 0), (4,0), (40), (4, 0), (si 7 , 0), {all, 0)}, where a *- («?, 0), b «- ( s f , 0), 
c <- (s|,0), d <- (s|,0), e <- (4,0), / <- (4 7 ,0) and g <- (s\l,0). The last 
graph of the figure shows the semantics obtained, using the representation algorithm 
described in [8]. 

In a way, this algorithm flattens a binary tree into a 2-tuple set, which
can be useful to express distances between nodes. The opposite is also true: a
linguistic term set can be expressed through a binary tree. One of the advantages of
this flattening is that it brings a new dimension to the data of a given problem.
This new dimension is the distance between the possible outcomes (the nodes that
can be decisions, choices, preferences, etc.) of the problem, and this would allow
for a ranking of the outcomes, as if we had a B-tree. The fact that the level of
the linguistic hierarchy is not the same, depending on the node depth, is interesting
since it gives a different granularity level and, as with Zadeh's granules, it makes it possible
to connect a position in the tree and a precision level.

Algorithm 2 Simplification algorithm

Require: $o$ is a node, $T$ is a binary tree, $o'$ is the root node of $T$
$o' \leftarrow (s_1^{3}, 0)$
for each node $o \in T$, $o \ne o'$ do
    let $(s_i^{n(t)}, \alpha)$ be the parent node of $o$
    if $o$ is a left child then
        $o \leftarrow (s_{2i-1}^{n(t+1)}, 0)$
    else
        $o \leftarrow (s_{2i+1}^{n(t+1)}, 0)$
    end if
end for
return the set of linguistic 2-tuples, one per node
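A minimal sketch of this flattening is given below. It assumes that the root is the middle label of l(1, 3) and that a left (resp. right) child takes the label immediately to the left (resp. right) of its parent's position in the next level, i.e. index 2i - 1 (resp. 2i + 1), since the parent's own position has index 2i there. The tree used is a hypothetical example, not the one of the paper's figure, and `flatten` is a hypothetical name.

```python
def flatten(tree, t=1, n=3, index=1, out=None):
    """Map each node name to (level t, n(t), label index); every alpha is 0.

    tree is a tuple (name, left, right), where left and right are subtrees or None.
    """
    if out is None:
        out = {}
    name, left, right = tree
    out[name] = (t, n, index)
    if left is not None:
        flatten(left, t + 1, 2 * n - 1, 2 * index - 1, out)
    if right is not None:
        flatten(right, t + 1, 2 * n - 1, 2 * index + 1, out)
    return out


tree = ("a", ("b", ("d", None, None), ("e", None, None)), ("c", None, None))
labels = flatten(tree)
# Normalized position of each node on [0, 1]: index / (n(t) - 1).
positions = {name: i / (n - 1) for name, (t, n, i) in labels.items()}
```

The positions give the new dimension mentioned above: nodes from different depths become comparable on a single axis, here a = .5, b = .25, c = .75, d = .125 and e = .375.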



6. CONCLUDING REMARKS 

In this paper, we have formally introduced and discussed an approach to deal with
unbalanced linguistic term sets. Our approach is inspired by the 2-tuple fuzzy linguistic
representation model from Herrera and Martinez, but we take full advantage
of the symbolic translations α, which become a very important element to generate
the data set.

The 2-tuples of our linguistic model are twofold. Indeed, except for the first and
the last ones of the partition, which have the shape of right-angled triangles, they all
are composed of two half 2-tuples: an upside and a downside 2-tuple. The upside
and downside of the 2-tuple are not necessarily expressed in the same hierarchy or
level. Regarding the partitioning phase, there is no need to have all the symbolic
translations equal to zero. This makes it possible to express the non-uniformity of the data
much better.

Despite the changes we made, the minimal cover property is fulfilled and proved.
Moreover, the aggregation operators that we redefine give consistent and satisfactory
results. Next steps in future work will be to study other operators, such as
comparison, negation, aggregation, implication, etc.